AIBase


Harvard University to Release Massive Free AI Training Dataset Funded by OpenAI and Microsoft

Harvard University announced on Thursday the release of a high-quality dataset of nearly one million public domain books, which anyone can use to train large language models and other AI tools. The dataset was created by Harvard's newly established Institutional Data Initiative, with funding from Microsoft and OpenAI. All of the books were scanned as part of the Google Books project and are no longer under copyright protection.


